May 27th, 2020

Profiling

  • summary of the times spent in different function calls
  • memory usage report

Pi calculation

\(\textrm{Surface circle} = \left ( \frac{\textrm{Surface circle}}{\textrm{Surface square}} \right ) * (\textrm{Surface square})\)

is always valid. Knowing that \(\textrm{Surface circle} = \pi * r^2\), \(\pi\) can be computed as:

\(\pi = \frac{1}{r^2} \left ( \frac{\textrm{Surface circle}}{\textrm{Surface square}} \right ) * (\textrm{Surface square})\)

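The ratio in parentheses is approximated with a Monte Carlo process: random points are thrown uniformly into the square, and the fraction of them that lands inside the circle estimates the ratio. For a circle of radius \(r = 1\) inscribed in the square \([-1, 1]^2\), \(\textrm{Surface square} = 4\), so \(\pi \approx 4 \cdot \textrm{hits} / N\) after \(N\) points.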
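As a minimal sketch of this idea, assuming a circle of radius 1 inscribed in the square [-1, 1] x [-1, 1] (so the square has surface 4), the ratio can be estimated in a few vectorized lines. The names n, inside, ratio, and pi_estimate are illustrative, and the seed is fixed only to make the run reproducible.

```r
set.seed(42)                      # fixed seed, only for reproducibility
n <- 100000                       # number of random points (illustrative value)
x <- runif(n, -1, 1)              # x coordinates inside the square
y <- runif(n, -1, 1)              # y coordinates inside the square
inside <- x^2 + y^2 <= 1          # TRUE for points that fall inside the circle
ratio <- mean(inside)             # approximates Surface circle / Surface square
r <- 1                            # radius of the circle
pi_estimate <- (1 / r^2) * ratio * 4   # formula above with Surface square = 4
print(pi_estimate)
```

The estimate changes slightly with the seed and gets closer to Pi as n grows.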

Surface ratio

  • The R function to compute Pi is:
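sim <- function(l) {
 # estimates[i] holds the running estimate of Pi after i random points
 estimates <- rep(0,l)
 hits <- 0
 # despite its name, pow2 returns the distance of the point x from the origin
 pow2 <- function(x) {
   x2 <- sqrt( x[1]*x[1]+x[2]*x[2] )
   return(x2)
 }
 for(i in seq_len(l)){
   # random point in the square [-1,1] x [-1,1]
   x <- runif(2,-1,1)
   # count the point if it falls inside the unit circle
   if( pow2(x) <=1 ){
     hits <- hits + 1
   }
   dens <- hits/i
   # the square has surface 4, so Pi is approximately 4 * hits/i
   pi_partial <- dens*4
   estimates[i] <- pi_partial
 }
 return(estimates)
}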

The accuracy of the estimate improves with the number of iterations: the running value settles around the true value of Pi (red line).

size <- 100000
res <- sim(size)
plot(res,type='l', xlab="Nr. iterations", ylab="Pi")
abline(h = pi, col = 'red')
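To put numbers on the plot, the sketch below prints the absolute error of the running estimate at a few checkpoints. It uses a vectorized stand-in for sim (cumsum over the hits) rather than the function above, and the checkpoint values are arbitrary. The statistical error of a Monte Carlo estimate shrinks roughly like 1/sqrt(N), although a single run need not decrease monotonically.

```r
set.seed(1)                                   # fixed seed, only for reproducibility
n <- 1000000                                  # total number of random points
inside <- runif(n, -1, 1)^2 + runif(n, -1, 1)^2 <= 1
running <- 4 * cumsum(inside) / seq_len(n)    # running estimate after each point
checkpoints <- c(100, 10000, 1000000)         # arbitrary iteration counts
abs_error <- abs(running[checkpoints] - pi)   # distance from the true value
print(data.frame(iterations = checkpoints, abs_error = abs_error))
```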

Monitoring the execution time

system.time

This function is part of base R, so nothing needs to be installed.

size <- 500000
system.time(
 res <- sim(size)
)
##    user  system elapsed 
##    1.52    0.00    1.59
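The three numbers are CPU time spent in R code (user), CPU time spent by the operating system on behalf of the process (system), and wall-clock time (elapsed). As a small sketch, the individual values can be extracted from the returned object by name; the loop below is only a placeholder workload standing in for sim(size).

```r
timing <- system.time(
  for (i in 1:200000) sqrt(i)     # placeholder workload, stands in for sim(size)
)
# the result is a named numeric vector of class "proc_time"
print(timing[["elapsed"]])                           # wall-clock seconds
print(timing[["user.self"]] + timing[["sys.self"]])  # CPU seconds of this process
```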

Tic toc

Another way to obtain execution times is by using the tictoc package:

install.packages("tictoc")

Calls to tic and toc can be nested, and passing log = TRUE to toc saves each result to an in-memory log that tic.log() returns:

library("tictoc")
size <- 1000000
sim2 <- function(l) {
   c <- rep(0,l)
   hits <- 0
   pow2 <- function(x) { x2 <- sqrt( x[1]*x[1]+x[2]*x[2] );  return(x2) }
   tic("only for-loop")
   for(i in 1:l){
      x = runif(2,-1,1)
      if( pow2(x) <=1 ){ hits <- hits + 1 }
      dens <- hits/i; pi_partial = dens*4; c[i] = pi_partial
   }
   toc(log = TRUE)
   return(c)
}

tic("Total execution time")
    res <- sim2(size)
## only for-loop: 2.86 sec elapsed
toc(log = TRUE)
## Total execution time: 2.87 sec elapsed

tic.log()
## [[1]]
## [1] "only for-loop: 2.86 sec elapsed"
## 
## [[2]]
## [1] "Total execution time: 2.87 sec elapsed"
tic.clearlog()
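The log can also be read back as numbers instead of formatted strings. The sketch below assumes the tictoc package is installed and uses Sys.sleep as a placeholder workload; the labels "outer" and "inner" are illustrative.

```r
library(tictoc)
tic.clearlog()                     # start from an empty log
tic("outer")
tic("inner")
Sys.sleep(0.1)                     # placeholder workload
toc(log = TRUE, quiet = TRUE)      # closes "inner"
Sys.sleep(0.1)                     # placeholder workload
toc(log = TRUE, quiet = TRUE)      # closes "outer"
raw_log <- tic.log(format = FALSE) # list of entries with tic, toc and msg
timings <- sapply(raw_log, function(entry) entry$toc - entry$tic)
names(timings) <- sapply(raw_log, function(entry) entry$msg)
print(timings)                     # elapsed seconds per label
tic.clearlog()
```

Having the timings as a numeric vector makes it easy to store or compare them across runs.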

Rprof

Rprof is part of base R (package utils), so it should be present in your R installation. For a graphical analysis we will use the proftools package, which needs to be installed if it is not already available. For R versions < 3.5 the instructions are:

install.packages("proftools")
source("http://bioconductor.org/biocLite.R")
biocLite(c("graph","Rgraphviz"))

while for R >= 3.5 one needs to do:

install.packages("proftools")
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install()
BiocManager::install(c("graph","Rgraphviz"))

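The profiling is performed with the following lines. Rprof("Rprof.out") starts the sampling profiler, which writes to the file Rprof.out, and Rprof(NULL) stops it: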

size <- 500000
Rprof("Rprof.out")
res <- sim(size)
Rprof(NULL)
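Rprof can also record memory use, which gives the memory usage report mentioned at the beginning. The sketch below is an assumption-laden example: workload is a hypothetical placeholder for sim(size), the profile goes to a temporary file so that Rprof.out is left untouched, and memory profiling is only attempted if the R build supports it.

```r
prof_file <- tempfile(fileext = ".out")   # temporary file, hypothetical location
workload <- function(n) {                 # placeholder standing in for sim(size)
  v <- numeric(0)
  for (i in seq_len(n)) v[i] <- sqrt(runif(1))
  sum(v)
}
if (capabilities("profmem")) {            # memory profiling is a build-time option
  Rprof(prof_file, interval = 0.01, memory.profiling = TRUE)
  workload(300000)
  Rprof(NULL)
  # memory = "both" adds memory information to the timing tables
  mem_summary <- summaryRprof(prof_file, memory = "both")
  print(mem_summary$by.self)
}
```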

The resulting profile is summarized with:

summaryRprof("Rprof.out") 
## $by.self
##         self.time self.pct total.time total.pct
## "runif"      0.68    48.57       0.68     48.57
## "sim"        0.40    28.57       1.40    100.00
## "pow2"       0.32    22.86       0.32     22.86
## 
## $by.total
##                       total.time total.pct self.time self.pct
## "sim"                       1.40    100.00      0.40    28.57
## "block_exec"                1.40    100.00      0.00     0.00
## "call_block"                1.40    100.00      0.00     0.00
## "eval"                      1.40    100.00      0.00     0.00
## "evaluate"                  1.40    100.00      0.00     0.00
## "evaluate::evaluate"        1.40    100.00      0.00     0.00
## "evaluate_call"             1.40    100.00      0.00     0.00
## "handle"                    1.40    100.00      0.00     0.00
## "in_dir"                    1.40    100.00      0.00     0.00
## "knitr::knit"               1.40    100.00      0.00     0.00
## "process_file"              1.40    100.00      0.00     0.00
## "process_group"             1.40    100.00      0.00     0.00
## "process_group.block"       1.40    100.00      0.00     0.00
## "rmarkdown::render"         1.40    100.00      0.00     0.00
## "timing_fn"                 1.40    100.00      0.00     0.00
## "withCallingHandlers"       1.40    100.00      0.00     0.00
## "withVisible"               1.40    100.00      0.00     0.00
## "runif"                     0.68     48.57      0.68    48.57
## "pow2"                      0.32     22.86      0.32    22.86
## 
## $sample.interval
## [1] 0.02
## 
## $sampling.time
## [1] 1.4

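Here you can see in $by.self that runif accounts for almost half of the run time (48.57%), followed by the body of sim itself (28.57%) and pow2 (22.86%). The other entries in $by.total (knitr::knit, rmarkdown::render, ...) are frames of the machinery that rendered these slides, not of our code. A graphical output can be obtained through the proftools package: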

library(proftools)
## Warning: package 'proftools' was built under R version 3.6.3
p <- readProfileData(filename = "Rprof.out")

plotProfileCallGraph(p, style=google.style, score="total")
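If Rgraphviz is not available, proftools also offers text summaries. The following sketch assumes proftools is installed; sim_stub is a hypothetical placeholder for sim, and the profile is written to a temporary file. Filtering with focus keeps only the call stacks that pass through the function of interest, which drops unrelated frames.

```r
library(proftools)
prof_file <- tempfile(fileext = ".out")    # temporary file, hypothetical location
sim_stub <- function(n) {                  # placeholder standing in for sim
  hits <- 0
  for (i in seq_len(n)) {
    x <- runif(2, -1, 1)
    if (sqrt(sum(x * x)) <= 1) hits <- hits + 1
  }
  4 * hits / n
}
Rprof(prof_file, interval = 0.01)
sim_stub(300000)
Rprof(NULL)
p <- readProfileData(filename = prof_file)
p_focus <- filterProfileData(p, focus = "sim_stub")  # keep stacks through sim_stub
print(funSummary(p_focus))                 # per-function totals
print(hotPaths(p_focus, total.pct = 10))   # call paths above 10% of total time
```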

Rbenchmark

This package is not part of the default R installation, so it most probably needs to be installed:

install.packages("rbenchmark")

Then we can benchmark our function sim():

library(rbenchmark)
size <- 500000
bench <- benchmark(sim(size), replications=10)
bench 
##        test replications elapsed relative user.self sys.self user.child
## 1 sim(size)           10   14.61        1     14.49     0.02         NA
##   sys.child
## 1        NA

The elapsed time is the total over the 10 replications we specified in the benchmark function, i.e. about 1.46 s per call of sim(size).
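A per-call time can be derived by dividing by the number of replications. The sketch below assumes rbenchmark is installed; work is a hypothetical placeholder for sim(size), and per_call is a column added here for illustration, not something benchmark returns.

```r
library(rbenchmark)
work <- function() for (i in 1:100000) sqrt(i)   # placeholder for sim(size)
bench <- benchmark(work(), replications = 10,
                   columns = c("test", "replications", "elapsed"))
# elapsed is summed over all replications, so divide to get seconds per call
bench$per_call <- bench$elapsed / bench$replications
print(bench)
```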

Microbenchmark

If this package is not installed, do as usual:

install.packages("microbenchmark")

and do the benchmarking with:

library(microbenchmark)
## Warning: package 'microbenchmark' was built under R version 3.6.3
bench2 <- microbenchmark(sim(size), times=10)

bench2 
## Unit: seconds
##       expr      min       lq     mean   median      uq      max neval
##  sim(size) 1.394034 1.413121 1.555204 1.540621 1.67281 1.847704    10

In this case we obtain more statistics on the benchmarking process: the minimum, lower quartile, mean, median, upper quartile, and maximum of the time per call.
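microbenchmark can also time several expressions side by side, which is handy once a hot spot has been found. The sketch below assumes microbenchmark is installed and compares two hypothetical variants of the hit counting, loop_version and vector_version, which are not part of the code above.

```r
library(microbenchmark)
n <- 10000                                   # illustrative problem size
loop_version <- function(n) {                # one runif call per point
  hits <- 0
  for (i in seq_len(n)) {
    x <- runif(2, -1, 1)
    if (sum(x^2) <= 1) hits <- hits + 1
  }
  4 * hits / n
}
vector_version <- function(n) {              # all points drawn at once
  4 * mean(runif(n, -1, 1)^2 + runif(n, -1, 1)^2 <= 1)
}
cmp <- microbenchmark(loop = loop_version(n),
                      vectorized = vector_version(n),
                      times = 10)
print(cmp)
medians <- summary(cmp)$median               # one median per expression
```

Since the profile pointed at the per-iteration runif calls, drawing all random numbers in one call is a natural variant to measure against the loop.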

References